Communications Medicine
Springer Science and Business Media LLC
Preprints posted in the last 7 days, ranked by how well they match Communications Medicine's content profile, based on 85 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.
Gong, L.; Aswani, N.; Shahinian, P.; Yang, J. Y.; Kontos, D.; Manji, G.; Kang, S.; Hur, C.
Electronic health record (EHR) prediction models often summarize longitudinal histories as static patient-level features, which may omit potentially informative event ordering. We developed a simplified spike-timing-dependent plasticity (STDP)-inspired framework that represents asynchronous EHR data as sparse, directional transition features. The approach encodes whether one clinical event precedes another within prespecified temporal windows, preserving event identity, directionality, and approximate timing while retaining feature-level interpretability. We evaluated this framework in two retrospective prediction tasks with different temporal scales: incident acute kidney injury (AKI) prediction in 17,351 MIMIC-IV ICU stays and early postoperative recurrence prediction in 713 CUMC patients with pancreatic ductal adenocarcinoma (PDAC). Models built on static burden features (demographics, comorbidities, raw lab measurements) were compared with models that additionally used STDP transition feature sets, using patient-level cross-validation and rolling prediction horizons. In AKI, a calibrated STDP ensemble model showed higher discrimination than static burden alone at the 24-hour decision snapshot for AKI by 72 hours, with AUROC 0.838 versus 0.800, and at 48 hours for near-term AKI prediction, with AUROC 0.868 versus 0.827. In PDAC, STDP transition features modestly improved Day -30 preoperative recurrence prediction over static burden, with AUROC 0.611 versus 0.587 and AUPRC 0.323 versus 0.318, and showed similar performance at Day 0 (7 days before recorded surgery date), with AUROC 0.681 and AUPRC 0.363. Decision-curve and feature analyses suggested that selected temporal transitions were clinically interpretable across renal, inflammatory, hepatobiliary, hematologic, glycemic, and nutritional trajectories.
These findings suggest that STDP-inspired transition features may provide a practical, interpretable way to incorporate temporal ordering into EHR-based risk prediction across both acute and longitudinal settings.
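The transition encoding described in this abstract (does event A precede event B within a prespecified window?) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `transition_features`, the event codes, and the 6-hour and 24-hour window bounds are all hypothetical choices made for illustration.

```python
from collections import Counter


def transition_features(events, windows=(6.0, 24.0)):
    """Count directional event pairs (a -> b) by the time window their gap falls in.

    events:  list of (time_in_hours, event_code) tuples for one patient.
    windows: hypothetical upper bounds (hours) of consecutive time bins,
             e.g. (0, 6] and (6, 24]. Gaps beyond the last bound are ignored.
    """
    ordered = sorted(events)  # chronological order
    feats = Counter()
    for i, (t_a, a) in enumerate(ordered):
        for t_b, b in ordered[i + 1:]:
            gap = t_b - t_a
            if gap <= 0 or a == b:
                continue  # skip simultaneous events and self-transitions
            for upper in windows:
                if gap <= upper:
                    # Key keeps event identity, direction, and approximate timing.
                    feats[(a, b, upper)] += 1
                    break
    return dict(feats)


# Hypothetical ICU stay: three asynchronous events.
stay = [
    (0.0, "creatinine_high"),
    (4.0, "vasopressor_start"),
    (20.0, "urine_output_low"),
]
features = transition_features(stay)
```

Each key is a sparse, directional feature that stays readable by name, which is the property the abstract emphasizes; a real pipeline would also need a rule for repeated events and for the prediction-time cutoff.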
Saad, A. A.; Murthi, S. B.; Boctor, E. M.; Teeter, W. A.; Seam, N.
The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.
Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.
Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation.
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types (preserved under polygenic but attenuated under proteomic enrichment), but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
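To see why event-rate enrichment shrinks a theoretical sample size when the hazard ratio is held fixed, the sketch below uses Schoenfeld's approximation for a 1:1 two-arm time-to-event trial. The abstract does not say which formula the authors used, so the formula choice, the function names, and the event probabilities (5% and 15%) are assumptions for illustration only.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation: events needed with 1:1 allocation, two-sided alpha."""
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2


def total_sample_size(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Participants needed so that the expected number of events is reached."""
    return ceil(required_events(hazard_ratio, alpha, power) / event_probability)


# Fixed 20% relative hazard reduction (HR = 0.8), as in the abstract.
n_unselected = total_sample_size(0.8, event_probability=0.05)  # hypothetical 5% event rate
n_enriched = total_sample_size(0.8, event_probability=0.15)    # hypothetical 15% event rate
reduction = 1 - n_enriched / n_unselected
```

With the hazard ratio fixed, the required number of events (roughly 630 here) does not change, so the sample size scales inversely with the event probability. If the effect attenuates in the enriched stratum, as the abstract reports for PPARG/type 2 diabetes, `log(hazard_ratio)` moves toward zero and the gain can disappear.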
Schmidt, P.; Preskorn, S.
In February 2026, the FDA announced that a single pivotal phase 3 (P3) trial would become the new default standard for drug approval, a regulatory direction that had been legally enabled since the FDA Modernization Act of 1997. This announcement has strategic, scientific, and economic implications for drug developers, contract research organizations (CROs), and biotech investors. We argue that the expansion of this framework, originally reserved for various niche submissions, represents a paradigm change that dramatically increases the value of rigorous early-phase (P1 and P2) trial design, requiring sponsors to establish both statistical efficacy signals and mechanistic biological understanding before entering phase 3. Using a CNS indication cost model, we show that single P3 approval can reduce total development expenditure from approximately $447 million over 14 years to $297 million over 12 years, saving $150 million and providing two additional years of commercial runway for a modeled CNS drug. Case examples including lecanemab, omaveloxolone, and tofersen illustrate how biomarker-informed early-phase strategies can establish the confirmatory evidence necessary for single-trial approval. We provide practical guidance for maximizing the value of P1 and P2 under this evolving framework.
Knudson, K. C.; Anderson, K. M.; Ballard, M.; Lenz, R. A.; Dam, T.; Sagman, D.; Brandon, N. J.; Banerjee, T.; Jaffe, A. E.
High placebo response is an obstacle in developing drugs to treat agitation in Alzheimer's disease (AAD), a prevalent and burdensome symptom. However, it has proved challenging to develop actionable models of placebo response that 1) can be applied prospectively, requiring only information available at screening or baseline, 2) yield strategies for reducing placebo response without equally depressing drug response, and 3) show generalizability across trials. Here, we first investigated placebo response in AAD at the trial level using meta-regression applied to 23 clinical trials. Meta-regression identified several factors associated with increased placebo response, but most of these factors were non-specific such that they predicted improvements in drug response as well. We therefore turned to individual level clinical trial datasets and applied causal modeling to predict which participants would have high placebo response relative to predicted drug response. We successfully built and validated the causal model across two independent clinical trials of risperidone and haloperidol at the level of individual patients (ability to predict subsequent improvement on drug or placebo). Crucially, we also found efficacy improvements in the overall trial through in silico exclusion/screen failing of high placebo-predicted subjects. We further characterized features most associated with placebo response to improve explainability and, lastly, validated the effect of these features at the trial level in clinical trials of galantamine, an acetylcholinesterase inhibitor (hence in a different class of drugs than those in the other two trials used). Taken together, we have developed and applied a causal modeling framework for reducing placebo response and increasing trial-level efficacy in neuropsychiatry clinical trials using historical trial datasets.
Stujenske, T. M.; Bouchard, T. P.; Troy, A.; Kelemen, S.; Folino, B.; Wills, T.; Sugden, L. A.
The recent availability of at-home menstrual cycle tracking technology has created opportunities for personalized assessment of reproductive health, alongside improved characterization of hormone patterns in women with and without reproductive disorders such as polyendocrine metabolic ovarian syndrome (PMOS), which affects approximately 10% of reproductive-age women. In this study, we leverage self-tracked urinary hormone data to develop an autoregressive Hidden Markov model (arHMM) that maps cycle days to physiologically meaningful phases based on hormone trajectories. By modeling day-to-day hormonal dynamics rather than absolute hormone levels, and allowing variable phase durations, this approach accommodates substantial variability in menstrual cycles, thereby enabling meaningful comparisons within and between individuals. Across more than 3800 cycles from over 1100 individuals, we find that arHMM-derived phases reproduce expected hormonal patterns within follicular, periovulatory, and luteal phases, and that phase-based timing for hormone testing outperforms conventional cycle day-based testing in capturing the luteinizing hormone surge and post-ovulatory progesterone rise, highlighting limitations of fixed-day clinical protocols. We identify phase-specific differences between healthy controls and individuals with self-reported PMOS, including lower luteinizing hormone in the periovulatory phase, and reduced luteal-phase progesterone levels in PMOS. Furthermore, features derived from arHMM phase assignments enable classification of PMOS status with ~78% accuracy, demonstrating the potential of this approach for non-invasive PMOS screening.
Vanbrabant, E.; Roefs, A.; Goossens, G.; Lemmens, L.; Shapovalova, Y.; Hesen, J.; Mironiuc, C.
Background: Obesity is globally recognized as a complex, multifactorial chronic disease, with biological, psychological, environmental and behavioural factors involved in both disease pathogenesis and maintenance. Although previous group-based studies demonstrated involvement of each of these factors, there is large inter-individual variability in the factors contributing to disease development as well as intervention outcomes, causing limited translatability to the individual level. This heterogeneity in treatment effectiveness might be due to differential causal and maintenance factors of obesity. To enable the transition from a one-size-fits-all approach to a more personalized approach for individuals with overweight or obesity, this study aims to investigate if and how the degree of weight loss and changes in daily life behaviour after a combined lifestyle intervention depend on individual baseline profiles comprising person characteristics, biological, psychological, environmental and behavioural factors. Methods: This study will include 600 individuals varying in BMI: 200 participants with a healthy BMI (18.5-24.9 kg/m2), 200 with overweight (BMI 25.0-29.9 kg/m2), and 200 with obesity (BMI ≥30.0 kg/m2). For all participants, a comprehensive individual baseline profile is created, including person characteristics, biological, psychological, environmental and behavioural factors. A clustering method is applied to identify clusters of participants with similar characteristics. Next, we examine if and how these clusters are linked to bodyweight indicators measured at baseline, and how they relate to daily lifestyle behaviour, as measured by ecological momentary assessment (EMA) using a smartphone app and sensor technology (3-week measurements).
Individuals with overweight or obesity will be randomized to the intensive lifestyle intervention or a lifestyle information condition, to determine if treatment response can be predicted based on cluster characteristics, how daily lifestyle behaviour changes after an intervention, and how changes in daily lifestyle behaviour relate to treatment response. Discussion: The End of Average study aims to characterize a large set of individuals varying in body weight to predict intervention effectiveness measured as changes in body weight indicators and in daily lifestyle behaviours. If reliable predictors of treatment success can be identified, these can be applied in personalized lifestyle interventions to improve lifestyle behaviour, body weight management and overall health.
Biswas, M. A.; Laila, A.
Background: Machine learning models trained on population health surveys offer scalable tools for cardiovascular screening, but recurring methodological weaknesses undermine their credibility and equity: data leakage from synthetic oversampling, qualitative rather than quantitative explainability evaluation, and the absence of demographic fairness auditing at the clinical operating threshold. Methods: We present EXHEART, a leakage-free stacked ensemble pipeline trained on BRFSS 2015 (n = 253,680) and validated on BRFSS 2020 (n = 319,795; temporal transport and retrain) and a clinical cardiovascular examination dataset (n = 68,730). The pipeline combines XGBoost, LightGBM, Random Forest, and a multi-layer perceptron as base learners with 5-fold out-of-fold logistic regression stacking and Platt scaling calibration. A quantitative SHAP-LIME consistency framework, based on Kendall-tau rank correlation and Jaccard overlap, accompanies a decision-curve analysis, a subgroup-stratified SHAP interaction analysis, and an intersectional fairness audit (Sex x Age x Income) with threshold-shifting mitigation and a frontier of the fairness-utility trade-off. The framework also adds cross-instrument fairness-disparity attribution, an empirical diagnostic that provides evidence on whether an observed subgroup disparity is more consistent with a measurement-induced or a substantive explanation by re-validating it on a dataset that measures the same clinical construct objectively. On heart disease, this diagnostic associates 89% of the sex TPR gap (95% CI [0.65, 0.99]) with the self-reported survey outcome rather than with a substantive risk difference. Results: On BRFSS 2015, EXHEART achieves AUC-ROC = 0.850, AUPRC = 0.371, Brier score = 0.071, and reduces ECE by 96% (0.256 to 0.011) via Platt scaling. 
Global SHAP-LIME rank agreement is moderate-to-strong (Kendall-tau = 0.580, Spearman-rho = 0.818) with a substantial top-3 divergence (Jaccard@3 = 0.200), where Stroke flips from SHAP rank 8 to LIME rank 1. The Sex TPR gap is 0.124 at the screening threshold; intersectional Sex x Age disparities reach 0.649 among adequately-powered cells, 5.2x the single-attribute gap. Temporal transport to BRFSS 2020 collapses sensitivity from 0.776 to 0.267, while retraining restores AUC = 0.840 and ECE = 0.012. On clinical examination data, the Sex TPR gap collapses to 0.014; the attribution test indicates this gap is instrument-dependent, consistent with a measurement or outcome-definition explanation rather than a substantive risk difference. Cross-domain SHAP analysis identifies four instrument-independent CVD risk factors and two major portability failures. Conclusions: EXHEART combines three practices that population-scale cardiovascular classifiers usually apply in isolation: leakage-free training with calibrated probabilities, a test of whether the model's explanations are stable, and a fairness audit that examines intersecting subgroups rather than single attributes. Bringing them together proved worthwhile. The intersectional audit revealed disparities that single-attribute auditing missed, and the cross-instrument comparison indicated that much of the sex gap reflects how the outcome is measured in survey data rather than a substantive difference in risk. The temporal transport findings indicate that deployed BRFSS models warrant periodic monitoring and retraining to maintain clinical utility. EXHEART is a retrospective methodological evaluation on public de-identified data; it is not validated for direct clinical decision-making, diagnosis, or treatment recommendation without prospective clinical validation.
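The two agreement metrics named in this abstract, Kendall-tau rank correlation and Jaccard overlap of the top-k features, can be sketched in a few lines of Python. This is an illustrative sketch, not EXHEART's code: the function names, the feature names, and the toy rankings are hypothetical, and the tau shown is the simple tau-a variant that assumes no tied ranks.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a between two rankings given as {feature: rank} dicts (no ties)."""
    feats = sorted(rank_a)
    concordant = discordant = 0
    for f, g in combinations(feats, 2):
        sign = (rank_a[f] - rank_a[g]) * (rank_b[f] - rank_b[g])
        if sign > 0:
            concordant += 1  # both methods order the pair the same way
        elif sign < 0:
            discordant += 1  # the methods disagree on the pair
    n_pairs = len(feats) * (len(feats) - 1) / 2
    return (concordant - discordant) / n_pairs


def jaccard_at_k(rank_a, rank_b, k):
    """Jaccard overlap between the top-k feature sets of two rankings."""
    top_a = {f for f, r in rank_a.items() if r <= k}
    top_b = {f for f, r in rank_b.items() if r <= k}
    return len(top_a & top_b) / len(top_a | top_b)


# Hypothetical rankings (1 = most important); not the study's values.
shap_rank = {"age": 1, "bp": 2, "bmi": 3, "stroke": 4}
lime_rank = {"stroke": 1, "age": 2, "bp": 3, "bmi": 4}

tau = kendall_tau(shap_rank, lime_rank)
overlap = jaccard_at_k(shap_rank, lime_rank, k=3)
```

For two top-3 sets, the Jaccard value can only be 0, 1/5, 1/2, or 1, so the reported Jaccard@3 of 0.200 corresponds to exactly one feature shared between the SHAP and LIME top three.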
Kim, D.; Pasco, R.; Johnson, K. E.; Fox, S. J.; Reich, N. G.; Meyers, L. A.
Accurate outbreak forecasts are critical for timely and effective public health response. In the United States, however, most forecasts are produced at the state level, which can mask substantial sub-state heterogeneity and limit their utility for local planning. We generated and evaluated forecasts of the percentage of Emergency Department visits attributable to influenza across 173 large metropolitan Health Service Areas (HSAs) using a gradient boosting quantile regression (GBQR) model, and compared their accuracy to forecasts derived from state-level data alone. At one-week, two-week, and three-week horizons, local forecasts outperformed state-based forecasts in 98.8%, 90.8%, and 78.6% of HSAs, respectively, achieving mean weighted interval scores that were on average 39.2% lower (95% range: 5.9% to 76.7%), 19.6% lower (-6.3% to 59.5%), and 11.4% lower (-11.7% to 44.9%), respectively. The performance advantage of local forecasting was strongest in HSAs representing a smaller share of their state's population and increased with the proportion of the HSA population living in urban areas and the number of metropolitan areas within a state. These results, based on an analysis of HSAs with populations greater than 250,000, demonstrate that fine-scale modeling can substantially improve forecast accuracy and highlight the potential value of local forecasts for outbreak preparedness and response.
Domian, H. I.; Tian, X.; Ong, D.; Hamilton, L.; Shieh, Y.; Musharoff, S. A.
Background: Polygenic risk scores (PRS) for breast cancer are increasingly used for risk stratification to inform screening and prevention. However, for PRSs to be equitable and clinically useful, they need to perform well across diverse populations. While PRS performance is known to be ancestry-dependent, it is not well understood how environmental context, such as socioeconomic status (SES), affects PRS transferability. Here, we assess whether SES, measured via self-reported household income, modifies breast cancer PRS performance and, if so, whether socioeconomic context contributes predictive information beyond genetic risk alone. Methods: We used the US-based All of Us biobank to evaluate how SES impacts breast cancer PRS performance. First, we quantified changes in breast cancer PRS performance by modeling a commonly cited polygenic score for breast cancer previously described by Mavaddat et al. with SES. We then re-estimated the genetic effect sizes of the 3,820 variants from Mavaddat et al. in All of Us with and without income as a covariate. Because social determinants of health affect breast cancer detection and outcomes, we stratified analyses by socially defined populations on the basis of self-identified race and ethnicity. We further stratified individuals whose self-identified race is White ("White") into three SES groups (high, middle, low) based on self-reported income and re-estimated genetic effect sizes to create SES-specific PRSs. We then applied these PRSs to White participants, the largest group in the study, and to Black or African American ("Black") and Hispanic or Latino ("Hispanic") participants, groups underrepresented in breast cancer research. Model discrimination between cases and controls was measured by area under the curve (AUC).
Results: We analyzed 163,715 women from the All of Us biobank, which included 8,833 breast cancer cases (6,619 White, 1,178 Black, and 1,036 Hispanic), with relative income available for a subset of these cases (5,525 White, 848 Black, and 566 Hispanic). The ancestry-dependent performance of the breast cancer PRS described in Mavaddat et al. was replicated in All of Us. In Black individuals, this PRS (AUC and 95% CI: 0.576 [0.571, 0.582]) produced a similar increase in AUC as relative income (AUC: 0.573 [0.568, 0.577]) when added to an age-only model. Incorporating income with PRS, age, and genetic PCs 1-3 improved AUC by 0.007 in White Americans and 0.018 in Black Americans (both p < 10^-11), while attenuating the contribution of PRS in the full model. PRS performance also varied among SES categories. Notably, PRSs with variant effect sizes that were recalibrated in low-SES White participants performed best in low-SES White participants (AUC: 0.605 [0.583, 0.628]) and Black Americans (AUC: 0.588 [0.586, 0.591]), both better than performance in high-SES White Americans (AUC: 0.579 [0.577, 0.580]) and middle-SES White Americans (AUC: 0.578 [0.569, 0.586]). Conclusion: Socioeconomic context, measured by income, significantly impacts the transferability of a PRS for breast cancer within and among groups defined by self-identified race and ethnicity. Accounting for SES improves PRS performance, most notably in Black Americans and low-SES White individuals.
Corona-Moreno, R.; Acuna-Zegarra, M. A.; Santana-Cibrian, M.; Velasco-Hernandez, J. X.
During the COVID-19 pandemic, limited testing capacity and reporting delays complicated epidemic surveillance and decision-making in Mexico. We calibrated covidestim, a Bayesian nowcasting model, to estimate the total SARS-CoV-2 infections from reported cases and deaths using Mexican surveillance data. Disease-progression distribution priors were calibrated using Mexico City records and validated through comparisons with national seroprevalence surveys, hospitalization data, and annual reported severe-case rates across all states. Using the reconstructed estimates of active infections, we implemented an event-based risk framework that quantifies the probability of encountering at least one infectious individual in gatherings of different sizes. This probability was subsequently translated into a four-level epidemiological traffic-light indicator and computed at both state and municipality levels. The resulting estimates revealed substantial spatial heterogeneity that is obscured by state-level aggregation, particularly in states with marked differences between urban and rural municipalities. To evaluate consistency with public-health indicators, we compared the proposed risk classification with the official Mexican epidemiological traffic-light system, considering interpretable gathering sizes relevant to public-health decision making. Weekly reports derived from this framework were delivered to policymakers in the State of Queretaro in Mexico, as an anticipation tool for school reopening and public-space management. This demonstrates that this Bayesian reconstruction of infections combined with event-based risk metrics can provide an interpretable and generalizable municipality-level complement to routine surveillance systems, particularly in regions with limited testing capacity and heterogeneous local transmission dynamics.
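The event-based risk described here, the probability of meeting at least one infectious person in a gathering, is commonly computed as 1 - (1 - p)^n, where p is the prevalence of active infections and n is the gathering size. The sketch below assumes that standard form with independent, uniformly mixed attendees; the abstract does not give the authors' exact formula. The function names, the colour labels, and the cutoffs of 0.25, 0.5, and 0.75 are hypothetical.

```python
def gathering_risk(active_infections, population, group_size):
    """P(at least one infectious attendee), assuming independent random mixing."""
    prevalence = active_infections / population
    return 1.0 - (1.0 - prevalence) ** group_size


def traffic_light(prob, cutoffs=(0.25, 0.5, 0.75)):
    """Map a probability to one of four levels; the cutoffs are illustrative only."""
    levels = ("green", "yellow", "orange", "red")
    for level, cutoff in zip(levels, cutoffs):
        if prob < cutoff:
            return level
    return levels[-1]


# Hypothetical municipality: 100 estimated active infections among 10,000 residents.
risk_50 = gathering_risk(100, 10_000, group_size=50)
level_50 = traffic_light(risk_50)
```

Because the risk grows quickly with gathering size, two municipalities in the same state with different prevalence can land on different levels for the same event, which is the spatial heterogeneity the abstract says state-level aggregation hides.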
Schwoebel, J.; Semenec, I.; Rousseva, J.; Frasch, M. G.; Thorstenson, R.; Bhatt, M.
Large language models embedded in autonomous agents process trusted instructions and untrusted data in one context window, leaving them open to direct and indirect prompt injection. In healthcare this is not hypothetical: a 2025 JAMA Network Open study found commercial medical LLMs followed injected instructions in 94.4% of simulated patient encounters, including life-threatening recommendations. Yet the clinically decisive problem we quantify here is different. Most real clinical threats (protected health information (PHI) exfiltration, cross-patient access, bulk export, out-of-scope advice) are fluent, legitimate-looking requests that carry no attack signal, so even a state-of-the-art injection detector passes them. Existing runtime guardrails trade safety against latency: model-based auditors are accurate but add hundreds of milliseconds of Python inference, while lexical filters are fast but blind to obfuscated or semantically disguised payloads. We present QFIRE, an inline, provider-agnostic prompt firewall implemented as a single self-contained Rust toolchain: proxy, CLI, and benchmark harness. QFIRE combines three mechanisms: (i) positive-security scope constraints, which restrict a model call to a declared natural-language purpose and block out-of-scope drift even when no overt attack token is present; (ii) an asynchronous detector graph that runs N rules and their detector nodes concurrently, cheapest checks first; and (iii) a de-obfuscation pass that decodes Base64, hex, and ROT13, folds homoglyphs and leetspeak, and strips zero-width characters before detection. QFIRE ships 106 versioned firewall rules and a dedicated HIPAA Safe Harbor 18-identifier PHI panel, and runs a local DeBERTa v3 injection classifier via embedded ONNX Runtime.
On 1968 public prompt-injection and jailbreak prompts, QFIRE's deterministic hybrid attains F1 0.86, statistically tied with Meta's state-of-the-art PromptGuard 2 (0.86) and above protectai DeBERTa v3 (0.83); lexical baselines lag at 0.16 to 0.50. Our central result is on QFIRE HealthBench, a new 2000-prompt healthcare benchmark we build and release with real garak and Microsoft PyRIT payloads. There the same PromptGuard 2 recovers only 0.40 recall (DeBERTa v3: 0.57), because most clinical threats carry no injection signal; QFIRE's combined scope-plus-PHI chain reaches 0.83 recall (F1 0.87) at a calibrated 0.08 false-positive rate. Generic injection detection, even state of the art, is therefore necessary but not sufficient for healthcare agents. A bare LLM judge also closes most of this static-corpus gap (F1 0.90); QFIRE's contribution beyond static accuracy is auditable determinism, bounded latency, and adaptive robustness, where the bare judge falls to 34 to 59% recall (Section 5.5). End to end, placing QFIRE in front of a tool-using agent over a mock EHR sandbox cuts the agent's harmful-action rate from 0.38 to 0.00 at a 0.13 benign-utility cost. All code, rules, corpora snapshots, and scripts are released, and every table regenerates from a single make paper target against local models with no paid API keys.
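A de-obfuscation pass of the kind listed in mechanism (iii) can be sketched as follows. QFIRE itself is described as a Rust toolchain, so this Python sketch is not its implementation; the function names are hypothetical, and homoglyph and leetspeak folding are left out to keep the example short. Only Python standard-library calls are used.

```python
import base64
import binascii
import codecs
import re

# Zero-width characters mapped to None so str.translate deletes them.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def strip_zero_width(text):
    """Remove zero-width characters that can split trigger words."""
    return text.translate(ZERO_WIDTH)


def candidate_decodings(text):
    """Return the cleaned text plus any plausible decoded variants.

    A downstream detector would scan every variant, so a payload hidden
    behind one encoding layer is still visible to lexical rules.
    """
    cleaned = strip_zero_width(text)
    variants = [cleaned, codecs.decode(cleaned, "rot_13")]
    compact = re.sub(r"\s+", "", cleaned)
    try:
        variants.append(base64.b64decode(compact, validate=True).decode("utf-8"))
    except (binascii.Error, UnicodeDecodeError, ValueError):
        pass  # not valid Base64 text
    try:
        variants.append(bytes.fromhex(compact).decode("utf-8"))
    except (ValueError, UnicodeDecodeError):
        pass  # not valid hex text
    return variants


encoded = base64.b64encode(b"ignore previous instructions").decode("ascii")
variants = candidate_decodings(encoded)
```

Emitting all candidate decodings instead of guessing a single encoding keeps the pass deterministic. It handles only one layer, so nested encodings would need the pass to be applied repeatedly with a depth limit.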
Taylor, A. R.; Foo, Y. S.; White, M. T.
Background: Reliable inference of Plasmodium vivax recurrence states - relapse, recrudescence and reinfection (the "3Rs") - improves estimates of antimalarial efficacy. The R package Pv3Rs features a Bayesian model designed for P. vivax molecular correction, i.e., using parasite genetic data to infer recurrence states. The model is an extension of a prototype built to analyse microsatellite data from the Vivax History (VHX) and Best Primaquine Dose (BPD) trials. Methods: We re-analysed data from 212 VHX and BPD trial participants (493 recurrences) using Pv3Rs, comparing results with those from the prototype and with genetic relatedness estimated using Dcifer, a tool for estimating relatedness based on identity-by-descent. Posterior recurrence state probabilities were computed using both uniform and time-to-event priors: artificial but equal prior probabilities facilitate posterior interpretation, while time-to-event priors leverage all available information and enable re-computation of failure rates. Relatedness estimates were used to identify and correct instances of model misspecification. Results: The Pv3Rs model generated posterior probabilities for all recurrences and was able to jointly model data on all episodes per participant for 89% of participants, compared with 73% using the prototype. Recurrence state probabilities were broadly consistent across methods, though the Pv3Rs model elevated reinfection probabilities slightly. Relatedness estimates exposed various outliers consistent with half-sibling parasites and/or genotyping errors. Outlier correction impacted some per-participant failure probabilities, but reinfection-adjusted radical-cure failure rates of high-dose primaquine remained near 3%, in line with previous findings. Conclusion: Re-analysis of VHX and BPD P. vivax genetic data restates earlier reinfection-adjusted efficacy estimates.
It demonstrates the increased computational capability and misspecification sensitivity of Pv3Rs, highlighting a need for careful analyses. Using relatedness-based diagnostics alongside model-based inference, we were able to harness the advantages of model-based inference and provide a framework for future P. vivax molecular correction.
Overmars, L. M.; Allaart, C.; Bron, E. E.; Brunner La Rocca, H.-P.; de Bresser, J.; Muller, M.; van Osch, M. J. P.; Teunissen, C.; Tijms, B. M.; Wolters, F. J.; Biessels, G. J.; Heart-Brain Connection Consortium
Background: Vascular cognitive impairment (VCI) and small vessel disease (SVD) involve many interconnected factors influencing multiple outcomes, also beyond cognitive decline. Bayesian networks (BNs) can help unravel these complex interrelations, which we demonstrate in this proof-of-concept study in the Heart-Brain Connection cohort, including memory-clinic patients with SVD, patients with heart failure, carotid occlusive disease, and reference participants. Methods: We trained BNs and jointly modelled cognitive decline (Clinical Dementia Rating (CDR) increase) and major adverse cardiovascular events (MACE) over five years as outcomes in relation to multiple demographic and disease factors and emerging imaging and plasma biomarkers, also considering possible non-random dropout. Results: Of 566 individuals (median age 68, 64% men), 134 had MACE and 112 experienced CDR increase. Diagnostic group and baseline cognition were key determinants of both outcomes. The BN identified baseline clinical severity as a non-random dropout source. Plasma biomarkers formed an interconnected subnetwork, linked to demographic and vascular factors, but without direct dependencies with outcomes. The trained BN also provides individualized inference under partial evidence, informing on outcome probabilities. Conclusion: This proof-of-concept study demonstrates how BNs quantify and visualize the dependency structure underlying prognostic heterogeneity in VCI and SVD, including non-random dropout and positioning of emerging biomarkers.
Gobeil, E.; Bourgault, J.; Enault, M.; Cote, V.; Mitchell, P. L.; Ruel, L.-J.; Girard, A. S.; Vohl, M.-C.; Arsenault, B. J.
Metabolic dysfunction-associated steatotic liver disease (MASLD) is rapidly increasing worldwide, yet effective targeted therapies remain limited. To better understand the molecular mechanisms underlying MASLD, we performed an integrated proteogenomic analysis of human liver tissue. Using mass spectrometry, we quantified 2,744 proteins in 504 liver biopsies from the Quebec Obesity Biobank and examined changes across disease stages. To investigate causality, we integrated liver proteomics with RNA sequencing and genome-wide genotyping to map thousands of protein quantitative trait loci (pQTLs) and expression quantitative trait loci (eQTLs). These molecular data were combined with summary statistics from a meta-analysis of genome-wide association studies including 16,532 MASLD cases and 1,240,188 controls. Mendelian randomization and genetic colocalization analyses revealed that most proteins differentially expressed across MASLD stages were not causally implicated in disease risk, whereas several genetically predicted liver proteins showed evidence of causal effects. Among these, higher hepatic levels of the MTARC1 protein were causally associated with MASLD and hepatic fat accumulation. Phenome-wide analyses suggested that MTARC1 inhibition may reduce the risk of cirrhosis, hepatocellular carcinoma, and cholelithiasis while improving lipid profiles. Notably, the causal MTARC1 variant influenced liver protein levels but not gene expression. Genetic analyses also identified ERLIN1 and HSD17B13 as potential therapeutic targets. In contrast, eQTLs and pQTLs at other loci such as GCKR showed opposite effects on MASLD risk. These findings highlight the importance of integrating tissue proteomics with human genetics to distinguish biomarkers from causal drivers and to identify promising therapeutic targets for MASLD.
Lange, B. K. A.; Graceffo, E.; Stenzel, W.; Biebermann, H.; Schuelke, M.; Wilpert, N.-M.
Gene therapy is rapidly emerging as a transformative treatment for monogenic neurological disorders, including pediatric movement disorders such as aromatic L-amino acid decarboxylase (AADC) deficiency. However, its success critically depends on defining target cells and windows for therapeutic intervention. Here, we present an open-access single-nucleus transcriptomic atlas of the human basal ganglia spanning a therapy-relevant window from second/third trimester to the perinatal period and adulthood. Across 35,755 nuclei, we identify major (non-)neuronal cell types, retrace developmental trajectories, and characterize gene-regulatory networks. We identify so far unrecognized human-specific expression of key neuronal signaling genes, including GNAO1 and ADCY5, and discuss the implications for targeted gene replacement therapies. Unexpectedly, we found that the Huntingtin gene (HTT) is already expressed during prenatal stages of human brain development, supporting a previously proposed neurodevelopmental component of Huntington's disease, which should be considered in diagnostic and therapeutic strategies. Moreover, FOXG1 expression and regulon activity are predominantly located in a prenatal time window, suggesting constraints on the effectiveness of postnatal interventions. Our findings highlight the importance of datasets capturing human brain development in real time and provide a publicly available resource to guide precision gene therapy strategies in the future.
Parisien-La Salle, S.; Tsai, C. H.; Newman, A. J.; Heydarpour, M.; Mahrokhian, S.; Hanna, I.; Brown, J. M.; Waikar, S.; Moussa, M.; Vaidya, A.
Background: Pathologic aldosteronism induces oxidative stress, tissue injury, and increases in hemoglobin. Conversely, aldosterone antagonist therapy decreases hemoglobin. Whether these effects are attributable to aldosterone-mediated changes in iron and oxygen metabolism is unknown. Methods: The plasma proteome of participants with overt primary aldosteronism (PA) (n=50) was compared with that of participants without overt PA (n=61). To isolate aldosterone-dependent effects, participants without overt PA underwent oral sodium suppression testing to quantify the magnitude of renin-independent aldosterone production, enabling monotonic dose-response analyses across the continuum of renin-independent aldosteronism (subclinical to overt PA). Differential abundance testing was performed using empirical Bayes linear modeling, followed by Reactome pathway enrichment analysis and covariate-adjusted sensitivity analyses. To validate clinical relevance, aldosterone dose-response trends with blood count parameters were examined in this cohort and in an independent population-based cohort of 5,713 people with hypertension. Results: 903 proteins in the peripheral circulation were differentially abundant in overt PA versus participants without PA. The most significantly increased protein in overt PA was CYBRD1, involved in iron reduction and absorption. Pathway enrichment identified 16 iron- and heme-related pathways, including erythropoietin signaling, heme biosynthesis and mitochondrial iron-sulfur cluster biogenesis, with increases in heme and erythroid proteins and decreases in mitochondrial iron-sulfur proteins. Linear aldosterone dose-dependent trend analyses across the PA continuum further supported this signature, identifying progressive increases in hemoglobin subunits (HBA1/HBB), heme-related proteins (HMBS, UROS, AMBP, HPX, GLO1) and erythrocyte oxygen handling enzymes (CA1/CA3), alongside progressive reductions in mitochondrial electron transport chain subunits (CYCS, ETFA). These proteomic changes corresponded with aldosterone dose-dependent increases in red blood cell count, hemoglobin, and hematocrit, in this cohort and another population-based cohort. Conclusion: The continuum of PA is characterized by a progressive shift away from mitochondrial oxidative phosphorylation and toward increased intestinal iron absorption, preferential iron transport over storage, and enhanced heme synthesis and recycling, possibly reflecting cellular pseudohypoxia and systemic adaptations to increase oxygen delivery. These findings provide a novel mechanistic basis for aldosterone-mediated tissue injury and the benefits of aldosterone-directed therapy.
Nagori, A.; Singh, P.; Firdos, S.; Devadiga, A.; Vats, V.; Gupta, A.; Bandhey, H.; Ailavadi, P.; Awasthi, R.; Narotam, N.; Mishra, A.; Lodha, R.; Sethi, T.
High-frequency physiological monitoring in ICUs can identify impending deterioration hours before clinical recognition, yet extracting reliable early-warning signals from noisy vital-sign streams remains challenging. We present SIgnose, an interpretable prediction framework for early detection of abnormal shock index (SI), built from routinely monitored vital signs using physiologic variability and nonlinear time-series features. SIgnose was developed on the eICU Collaborative Research Database and externally validated on the MIMIC-III adult database and a pediatric SafeICU cohort (AIIMS New Delhi), with additional prospective validation in the pediatric ICU. We benchmarked three representation strategies: (i) engineered physiologic variability and nonlinear time-series features, (ii) deep learning, and (iii) Llama-3.1-8B embeddings with low-rank adaptation. Physiologic variability features consistently demonstrated superior cross-cohort generalization. The final model used 3,970 features from five vital signs to predict abnormal SI up to 8 hours ahead, achieving AUROC 0.861 (95% CI 0.859-0.863) and AUPRC 0.927 (95% CI 0.925-0.929) on eICU. External validation yielded AUROC 0.870 (95% CI 0.863-0.876) and AUPRC 0.935 (95% CI 0.930-0.940) on MIMIC-III, and AUROC 0.875 (95% CI 0.863-0.888) and AUPRC 0.915 (95% CI 0.898-0.930) on SafeICU; prospective pediatric validation (n = 88) achieved AUROC 0.885 (95% CI 0.868-0.902) and AUPRC 0.911 (95% CI 0.882-0.936). SHAP interpretability analysis identified heart rate variability, respiratory trend dynamics, and multi-scale blood pressure variability as key early-warning signatures. These findings establish SIgnose as a reproducible, low-compute, early-warning framework and demonstrate that physiologic variability features provide robust, generalizable representations for early deterioration detection across adult and pediatric critical care.
Spencer, G. M.; Karim, K.; Dzioba, A.; Graham, M. E.; You, P.; Hummel, T.; Gellrich, J.; Coyle, P.; Burns, H.; Peer, S.; Zawawi, F.; Lechien, J. R.; Schriever, V. A.; Bhargava, E. K.; Whitcroft, K. L.
Background: Olfactory dysfunction (OD) in children remains underdiagnosed and poorly characterised. Despite its known impacts on nutrition, quality of life, safety awareness, and psychosocial development, no standardised diagnostic or management pathway currently exists for paediatric OD. This study aimed to characterise global practice patterns and identify diagnostic and therapeutic challenges unique to paediatric care. Methodology/Principal: A 44-item cross-sectional online survey was distributed to a verified international network of paediatric otolaryngologists across 36 countries via a closed professional platform. The survey assessed five domains: diagnostic practices, management protocols, technology and innovation, education and training, and barriers to effective care. Regional grouping was used to facilitate meaningful statistical comparisons. Categorical variables were evaluated using chi-square tests, with odds ratios and 95% confidence intervals reported for significant findings. Results: Of 351 potential participants, 167 responded (47.6% response rate). Most respondents (83%) reported seeing children with OD, yet 95% saw fewer than ten such patients annually. Psychophysical testing was never performed by 54.8% of respondents, while 88.4% routinely ordered cross-sectional imaging. Testing frequency increased significantly with patient age (Cochran's Q p<0.001). The most common barriers to objective testing were insufficient training (44.3%), time constraints (29.9%), and funding limitations (28.1%). Multidisciplinary collaboration was negligible. Significant regional variation was observed across most practice domains. Conclusions: Paediatric OD care is characterised by functional underinvestigation, fragmented multidisciplinary collaboration, and systemic educational gaps. These findings support urgent development of standardised clinical guidelines, age-appropriate validated assessment tools, and formal interdisciplinary care pathways.
Ahmed, M.; Ahmed, F.; Mow, S. M.; Taha, P. A.; Barua, S.; Rahman, M. M.; Rafy, A.; Mondol, S. M.; Faisal, M. I.
Post-surgical adverse outcomes, including mortality, intensive care readmission, and complications, remain major challenges for clinical decision-making. Existing machine learning approaches focus on outcome prediction while operating as opaque systems, limiting clinical trust and the translation of predictions into treatment decisions. In addition, many clinical studies rely on synthetic data in which shared intermediate variables create circular dependencies between inputs and targets that compromise reported performance. We aimed to develop an explainable multimodal architecture and a rigorous evaluation methodology that address these gaps. We designed a two-stage architecture integrating supervised deep learning for risk prediction with conservative Q-learning for action recommendation. The first stage uses five modality-specific encoders for structured records, physiological time-series, chest radiographs, clinical notes, and surgical metadata, unified through cross-modal attention into a shared patient-state representation. The second stage applies offline reinforcement learning to recommend clinical actions while preventing value overestimation. We formally characterized a target-leakage flaw in synthetic pipelines and proposed a real-data methodology using a verified clinical database, with event-censored temporal separation and uncertainty-weighted per-task training. Component-level behavior was validated on a controlled synthetic benchmark, demonstrating that the architecture functions as designed without claiming clinical validity. The cross-modal attention and risk-prediction components behaved as expected, whereas the offline reinforcement learning stage did not converge on the benchmark, indicating that value estimation requires further investigation on real clinical data. The architecture provides dual-level explainability through attention visualization and value decomposition, contributing a deployable design, a formal methodological critique of synthetic-data practices, and a complete framework for clinically valid evaluation.